{
 "cells": [
  {
   "cell_type": "markdown",
   "source": [
    "###Dataset\r\n",
    "This specific dataset seperates flowers into 3 different classes of species.\r\n",
    "- Setosa\r\n",
    "- Versicolor\r\n",
    "- Virginica\r\n",
    "\r\n",
    "The information about each flower is the following.\r\n",
    "- sepal length\r\n",
    "- sepal width\r\n",
    "- petal length\r\n",
    "- petal width"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "from __future__ import absolute_import, division, print_function, unicode_literals\r\n",
    "\r\n",
    "\r\n",
    "import tensorflow as tf\r\n",
    "\r\n",
    "import pandas as pd"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "CSV_COLUMN_NAMES = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth', 'Species']\r\n",
    "SPECIES = ['Setosa', 'Versicolor', 'Virginica']"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "train_path = tf.keras.utils.get_file(\r\n",
    "    \"iris_training.csv\", \"https://storage.googleapis.com/download.tensorflow.org/data/iris_training.csv\")\r\n",
    "test_path = tf.keras.utils.get_file(\r\n",
    "    \"iris_test.csv\", \"https://storage.googleapis.com/download.tensorflow.org/data/iris_test.csv\")\r\n",
    "\r\n",
    "train = pd.read_csv(train_path, names=CSV_COLUMN_NAMES, header=0)\r\n",
    "test = pd.read_csv(test_path, names=CSV_COLUMN_NAMES, header=0)"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "train.head()"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "train_y = train.pop('Species')\r\n",
    "test_y = test.pop('Species')\r\n",
    "train.head() "
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "def input_fn(features, labels, training=True, batch_size=256):\r\n",
    "    # Convert the inputs to a Dataset.\r\n",
    "    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))\r\n",
    "\r\n",
    "    # Shuffle and repeat if you are in training mode.\r\n",
    "    if training:\r\n",
    "        dataset = dataset.shuffle(1000).repeat()\r\n",
    "    \r\n",
    "    return dataset.batch(batch_size)"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "my_feature_columns = []\r\n",
    "for key in train.keys():\r\n",
    "    my_feature_columns.append(tf.feature_column.numeric_column(key=key))\r\n",
    "print(my_feature_columns)"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "###Building the Model\r\n",
    " And now we are ready to choose a model. For classification tasks there are variety of different estimators/models that we can pick from. Some options are listed below.\r\n",
    "- ```DNNClassifier``` (Deep Neural Network)\r\n",
    "- ```LinearClassifier```\r\n",
    "\r\n",
    "We can choose either model but the DNN seems to be the best choice. This is because we may not be able to find a linear coorespondence in our data. \r\n",
    "\r\n",
    "So let's build a model!"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "# Build a DNN with 2 hidden layers with 30 and 10 hidden nodes each.\r\n",
    "classifier = tf.estimator.DNNClassifier(\r\n",
    "    feature_columns=my_feature_columns,\r\n",
    "    # Two hidden layers of 30 and 10 nodes respectively.\r\n",
    "    hidden_units=[30, 10],\r\n",
    "    # The model must choose between 3 classes.\r\n",
    "    n_classes=3)"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "###Training"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "classifier.train(\r\n",
    "    input_fn=lambda: input_fn(train, train_y, training=True),\r\n",
    "    steps=5000)"
   ],
   "outputs": [],
   "metadata": {}
  },
  {
   "cell_type": "markdown",
   "source": [
    "###Evaluation"
   ],
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "source": [
    "eval_result = classifier.evaluate(\r\n",
    "    input_fn=lambda: input_fn(test, test_y, training=False))\r\n",
    "\r\n",
    "print('\\nTest set accuracy: {accuracy:0.3f}\\n'.format(**eval_result))"
   ],
   "outputs": [],
   "metadata": {}
  }
 ],
 "metadata": {
  "orig_nbformat": 4,
  "language_info": {
   "name": "python",
   "version": "3.9.6",
   "mimetype": "text/x-python",
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "pygments_lexer": "ipython3",
   "nbconvert_exporter": "python",
   "file_extension": ".py"
  },
  "kernelspec": {
   "name": "python3",
   "display_name": "Python 3.9.6 64-bit"
  },
  "interpreter": {
   "hash": "f090ed6cf071543b1cae4b626fdd34eac9848e74057a248c66d6f1f1665c5e6f"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}